
Model shapes config #2036


Merged: 14 commits merged into main on Apr 23, 2025

Conversation

jainapurva
Contributor

@jainapurva jainapurva commented Apr 10, 2025

This pull request introduces significant updates to the microbenchmarking framework, focusing on new model types and enhanced shape generation options. The changes expand functionality and enable more extensive benchmarking configurations.

Enhancements to Model Types and Shape Generation

  • Added support for new model types, including ln_linear_<activation> (e.g., sigmoid, relu, gelu) and transformer_block with self-attention and MLP. These are documented in benchmarks/microbenchmarks/README.md.
  • Introduced multiple shape generation options (custom, llama, pow2, pow2_extended, sweep) to support diverse matrix shapes for benchmarking. These options are implemented in benchmark_runner.py and documented in the README.
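To make the shape generation options concrete, here is a small sketch of how a pow2-style option could expand `min_power`/`max_power` into matmul shapes. This is a hypothetical reconstruction, not the actual benchmark_runner.py code; in particular, the assumption that every (m, k, n) combination of powers of two in the range is enumerated is mine.

```python
# Hypothetical sketch of the "pow2" shape option; the real logic lives in
# benchmarks/microbenchmarks/benchmark_runner.py and may differ. Assumes
# every (m, k, n) combination of powers of two in range is enumerated.
from itertools import product


def pow2_shapes(min_power, max_power):
    """Return all (m, k, n) matmul shapes with each dim a power of two."""
    sizes = [2**p for p in range(min_power, max_power + 1)]
    return list(product(sizes, repeat=3))


shapes = pow2_shapes(10, 12)  # dims drawn from 1024, 2048, 4096
print(len(shapes))  # 3 sizes per dim, 3 dims -> 27 shapes
```

The other options (`pow2_extended`, `sweep`) presumably vary the granularity of the generated sizes; see the README for their exact semantics.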

Refactoring and Code Simplification

  • Refactored model creation logic by replacing create_model_and_input with create_model_and_input_data, now imported from torchao.testing.model_architectures. This centralizes model definitions and input data generation.
  • Removed redundant model definitions (ToyLinearModel, LNLinearSigmoid) from utils.py, consolidating them into torchao.testing.model_architectures.

Future TODO: Refactor torchao to use model definitions from torchao.testing.model_architectures (#2078)

Updates to Configuration

  • Expanded benchmark_config.yml to include configurations for new model types and shape generation options, such as llama and pow2.

Documentation Improvements

  • Updated README.md to provide detailed descriptions of new model types and shape generation options, ensuring users can easily understand and utilize the new features.

These changes collectively enhance the flexibility, maintainability, and usability of the benchmarking framework.

Sample configuration.yml for inference benchmarks

benchmark_mode: "inference"
quantization_config_recipe_names:
  - "float8dq"
  - "float8wo"
output_dir: "benchmarks/microbenchmarks/results"
model_params:
  - name: "ln_linear_sigmoid_cuda"
    matrix_shapes:
      - name: "custom"
        shapes: [
          [2048, 4096, 1024],
        ]
    high_precision_dtype: "torch.bfloat16"
    use_torch_compile: true
    torch_compile_mode: "max-autotune"
    device: "cuda"
    model_type: "ln_linear_sigmoid"
    enable_profiler: true

  - name: "bf16_transformer_block"
    matrix_shapes:
      - name: "custom"
        shapes: [
          [2048, 4096, 1024],  # For transformer_block, k is the hidden dimension
        ]
    high_precision_dtype: "torch.bfloat16"
    use_torch_compile: true
    torch_compile_mode: "max-autotune"
    device: "cuda"
    model_type: "transformer_block" # TODO: Add a custom model (Figure out how to do this, maybe pass a .py file with model definition)
    enable_profiler: true

  - name: "large_bf16_ln_linear"
    matrix_shapes:
      - name: "llama"  # Example of using LLaMa shapes
      - name: "pow2"  # Example of using power of 2 shapes
        min_power: 10  # 1024
        max_power: 12  # 4096
      - name: "pow2_extended"  # Example of using extended power of 2 shapes
        min_power: 10  # 1024
        max_power: 11  # 2048
      - name: "sweep"  # Example of using sweep shapes (use with care: generates many shapes)
        min_power: 8   # 256
        max_power: 9   # 512
    high_precision_dtype: "torch.bfloat16"
    use_torch_compile: true
    torch_compile_mode: "max-autotune"
    device: "cuda"
    model_type: "linear"
    enable_profiler: true  # Enable profiling for this model
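The sample config above can be illustrated with a minimal sketch of how a runner might expand the `custom` shape entries. The dict literal mirrors the YAML (parsing, e.g. via PyYAML's `yaml.safe_load`, is omitted to keep the sketch dependency-free), and the traversal logic is my assumption rather than the actual benchmark_runner.py implementation.

```python
# Sketch of iterating a parsed benchmark config; the dict literal stands
# in for the YAML above (a real runner would load the file first). The
# traversal is an assumption, not the actual benchmark_runner.py code.
config = {
    "benchmark_mode": "inference",
    "model_params": [
        {
            "name": "ln_linear_sigmoid_cuda",
            "model_type": "ln_linear_sigmoid",
            "matrix_shapes": [
                {"name": "custom", "shapes": [[2048, 4096, 1024]]},
            ],
        },
    ],
}

runs = []
for params in config["model_params"]:
    for spec in params["matrix_shapes"]:
        if spec["name"] == "custom":
            # Each shape is assumed to be an (m, k, n) triple.
            for m, k, n in spec["shapes"]:
                runs.append((params["name"], params["model_type"], m, k, n))

print(runs)
```

Named options like `llama` or `pow2` would be expanded into concrete shape lists by the runner instead of being read directly from `shapes`.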

@jainapurva jainapurva requested a review from Copilot April 10, 2025 19:18

pytorch-bot bot commented Apr 10, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/ao/2036

Note: Links to docs will display an error until the docs builds have been completed.

✅ You can merge normally! (6 Unrelated Failures)

As of commit 8f73ebf with merge base d06b3e3:

BROKEN TRUNK - The following jobs failed but were present on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Apr 10, 2025
Contributor

@Copilot Copilot AI left a comment


Copilot reviewed 10 out of 10 changed files in this pull request and generated 2 comments.

@jainapurva jainapurva added the topic: performance Use this tag if this PR improves the performance of a feature label Apr 10, 2025
Contributor

@Copilot Copilot AI left a comment


Copilot reviewed 10 out of 10 changed files in this pull request and generated 1 comment.

@jainapurva jainapurva force-pushed the model_shapes_config branch from bcdb20c to bbcba36 Compare April 10, 2025 23:12
@jainapurva jainapurva marked this pull request as ready for review April 14, 2025 19:34
import torch.nn as nn


class ToyLinearModel(torch.nn.Module):
Contributor


If you plan to use these for benchmarks, they should be in torchao rather than testing. I think if someone pip installs torchao, it doesn't install the test dir.

Contributor


Also, we should probably mark a TODO to consolidate all random model architecture definitions into one place. ToyLinear shows up all over the place in our testing code.

Contributor Author


torchao.testing is built into the wheel. As discussed offline, we can define models in torchao.testing and use them everywhere in torchao, tests, and benchmarks.

@HDCharles
Contributor

It looks like there's more here than just the model shapes config; you should add more documentation to the PR description about all the things that are being changed/updated.

Contributor

@HDCharles HDCharles left a comment


Anything expected to be used outside of tests should be in ao. Imports from benchmarks into testing should have a really good justification.

@jainapurva jainapurva changed the base branch from main to bench-gpu-profiling April 15, 2025 19:41
@jainapurva jainapurva changed the base branch from bench-gpu-profiling to main April 18, 2025 17:47
@jainapurva jainapurva requested a review from HDCharles April 18, 2025 18:29
Contributor

@HDCharles HDCharles left a comment


lgtm

@jainapurva jainapurva merged commit a724a37 into main Apr 23, 2025
12 of 18 checks passed
@jerryzh168
Contributor

jerryzh168 commented Apr 23, 2025

Sorry, we'll need to revert this to fix the diff train; we can't touch torchao/quantization/qat/embedding.py or torchao/quantization/qat/linear.py for now.

jerryzh168 added a commit that referenced this pull request Apr 23, 2025
Revert "Model shapes config (#2036)"

This reverts commit a724a37.
@jainapurva
Contributor Author

Sorry, we'll need to revert this to fix the diff train; we can't touch torchao/quantization/qat/embedding.py or torchao/quantization/qat/linear.py for now.

It was an automatic lint fix, as the ruff tests were failing. I can reland without touching those two files.

@jerryzh168
Contributor

@jainapurva Should be fine now; the diff train is restored.

@jainapurva jainapurva mentioned this pull request Apr 23, 2025